Derives the Bellman optimality equation and its fixed-point properties. Analyzes value iteration as a contraction mapping and shows how the model and rewards determine the optimal policy.
reinforcement learning
bellman optimality
value iteration
study notes